Abstract


Membrane-bound organelles, such as mitochondria and chloroplasts, have played 
a crucial role in the evolution of plant cells. In this study, we investigate the pres- 
ence of heteroplasmy and genomic variation in liverworts, a group of non-vascular 
plants, using nanopore sequencing technology. We selected four liverwort species 
representing different lineages: Riccia fluitans, Apopellia endiviifolia, Aneura pin- 
guis, and Scapania undulata. 


Through nanopore sequencing, we sequenced, assembled, and annotated the 
organellar genomes of selected liverwort species. The plastid genomes of Riccia 
fluitans, Apopellia endiviifolia, Aneura pinguis, and Scapania undulata exhibited 
high conservation with previously published genomes, while the mitogenome of 
Scapania undulata represents the first report for this species. The analysis of the 
liverwort organellar genomes revealed conserved gene content, structure, and 
order. 


We further investigated heteroplasmy within the liverwort species. The plastome 
analysis did not detect structural heteroplasmy, which is observed in some 
angiosperms but seems limited to seed plants. However, in the mitogenomes, 
we found evidence of heteroplasmy in Aneura pinguis, Apopellia endiviifolia, and 
Scapania undulata. The heteroplasmic sites in the mitogenomes were mainly 
represented by substitutions, indels, and short tandem repeat polymorphisms. 
Some of the identified substitutions resembled RNA editing patterns observed in 
liverworts. 


This study highlights the utility of nanopore sequencing for studying organel- 
lar genomes and detecting heteroplasmy in liverworts. The findings expand our 
understanding of organellar genomic variation in non-vascular plants and provide 
insights into the mechanisms underlying heteroplasmy in liverwort mitogenomes. 
Further research is needed to explore the functional significance of heteroplasmy 
and its implications for liverwort evolution and adaptation. 
1. Introduction


Membrane-bound organelles such as mitochondria and chlo- 
roplasts are present in plant cells and are believed to have orig- 
inated from endosymbiotic bacteria-like organisms around 
two billion years ago (Palmer & Delwiche, 1998). One significant
piece of evidence supporting this theory is the presence of
organellar genomes, which are double-stranded closed circu- 
lar molecules of DNA that are unique to mitochondria and 
chloroplasts. In most plants, organellar genomes are inher- 
ited uniparentally, meaning that only one parental genome 
is passed down. This inheritance pattern has allowed us to 
establish connections between individuals and species. How- 
ever, there are exceptions to this pattern, as observed in some 
gymnosperms, including Cryptomeria japonica, Larix decidua 
and Larix leptolepis, that exhibit possible biparental inheri- 
tance, while Biota orientalis, Calocedrus decurrens, Picea pun- 
gens, Picea glauca, Pinus contorta x banksiana, Pinus taeda, 
Pseudotsuga menziesii and Sequoia sempervirens demonstrate
paternal inheritance (Reboud & Zeyl, 1994). 


Organellar genomes can differ from each other within a 
cell or between cells within an individual, resulting in more 
than one variant of the genome. This phenomenon, called
heteroplasmy, was initially detected in plastids and has since
been observed in almost all clades of the green plant tree
of life, as well as in the mitogenomes of both plants and 
animals (Kmiec et al., 2006; Ramsey & Mandel, 2019). Hetero- 
plasmy can result from several mechanisms, including gene 
rearrangements, gene chimeras induced by recombination, 
insertions or deletions, and point mutations. Moreover, 
in cases of biparental inheritance, in which the parents 
have different plastomes or mitogenomes, heteroplasmy 
can also occur in the offspring. Heteroplasmy can manifest 
visibly, such as in the green and white patterns seen in 
variegated leaves, where the green pattern results from healthy 
green plastids and the white pattern arises from diseased or 
genetically distinct colorless plastids (Chiu et al., 1988). 


In the study of the plastid genomes of Phoenix dactylifera 
cultivars, two mechanisms of plastid heteroplasmy are men- 
tioned (Sabir et al., 2014). The first is biparental inheritance, 
which means that the offspring inherit organelles from both 
parents. The second mechanism mentioned is incomplete 
sorting in uniparental inheritance, which is most likely in date 
palms due to their maternal inheritance of the plastid genome. 
Incomplete sorting occurs when distinct plastid variants fail
to segregate fully, so that the parental gametes remain heteroplasmic.
There are, of course, other reasons why heteroplasmy occurs
in chloroplasts. For example, some plant species
of the genus Medicago presented two types of heteroplasmy,
the first resulting from a single base pair change and the
second from indels occurring within individuals and between
species (Johnson & Palmer, 1989).


The plant mitochondrial genome is more complex than the 
plastid genome. In particular, the vascular plant mitogenome 
is larger than that of animals, e.g., the size of the watermelon 
mitogenome is 2,500 kb, whereas that of animals is between 
16 and 20 kb (Palmer & Herbon, 1987; Ward et al., 1981). 
In addition, whereas animal mitogenomes are circular, the plant
mitogenome varies greatly in structure and can be present in
circular, linear, or complex molecular
forms (Kmiec et al., 2006). Heteroplasmy can be caused by
independent mutations or biparental inheritance, as in the
case of plastid genomes (Wolfe & Randle, 2004), or by muta- 
tion, recombination and paternal leakage (Ramsey & Mandel, 
2019). Furthermore, transfer of plastid DNA into the
mitogenome appears to be rather common in vascular
plants and is another cause of mitochondrial genome
heteroplasmy (Park et al., 2020; Szandar et al., 2022).


To detect heteroplasmy in the liverwort organellar genomes, 
we apply the latest nanopore sequencing technology, which 
enables unbiased identification, quantification, and phasing
of haplotypes. Heteroplasmy in liverworts has not been
studied so far, on the assumption of uniparental
inheritance (Pacak & Szweykowska-Kulinska, 2003), but
recent discoveries in vascular plant organellar genomics
(Lee et al., 2020; Wang & Lanfear, 2019) encouraged us to
undertake this work. To verify the presence of heteroplasmy
in the liverworts we selected four species representing four 
major lineages, including Riccia fluitans (Marchantiopsida
- complex thalloid), Apopellia endiviifolia (Pelliidae - simple
thalloid I), Aneura pinguis (Metzgeriidae - simple thalloid II)
and Scapania undulata (Jungermanniidae - leafy liverworts).


The main aims were: (i) sequencing, assembly and annotation of
organellar genomes using nanopore technology, (ii) analysis of
liverwort organellar genomes for the presence of structural
heteroplasmy, and (iii) identification of intraindividual
variation within plastomes and mitogenomes.


2. Material and methods 
2.1. Plant material 


The liverwort species were collected in Poland, with
the exception of Riccia fluitans, which was bought from a
commercial company (Table 1). To exclude the possibility of
inter-individual variation, collected species were cultured in 
vitro to obtain the required amount of plant material from 
a single individual. Based on literature data, three types of 
media were selected and tested for in vitro cultivation: solid 
half-strength Gamborg’s B5 medium basal salts (Gamborg 
et al., 1968) including organics and vitamins - ½GB5 medium
(Althoff et al., 2022), ½GB5 medium containing 100 µg/mL
cefotaxime - ½GB5+C medium (Althoff & Zachgo, 2020),
solid half-strength MS medium basal salts (Murashige &
Skoog, 1962) including LS medium organics and vitamins
(Linsmaier & Skoog, 1965) - ½MS+ovLS medium (Pence
et al., 2005). The medium on which the best plant growth was
observed was selected for the study, and minor modifications
were introduced. Finally, the liverworts were grown on the ½GB5
medium with 20 g·L⁻¹ sucrose, 8 g·L⁻¹ agar-agar and pH 6.0.
The upper fragments of sterile plants were used as secondary 
explants and were placed on the medium in the form of five 
small clumps separated by approx. 1-2 cm. Plants were grown 
in climate chambers at 24 °C under long-day conditions with 
a 16:8 photoperiod (16 h light; 8 h dark). 


Table 1 Liverwort species, locality, voucher and organellar genome accession numbers.

| Species/Lineage | Locality | Herbarium voucher | Mitogenome accession number/coverage | Plastome accession number/coverage |
|---|---|---|---|---|
| Aneura pinguis L. - simple thalloid II | Sichlański stream, Murzasichle, Tatra Mts, Poland | OLS-L-ST1003 | OQ884155/202x | OQ700932/446x |
| Apopellia endiviifolia (Dicks.) Nebel & D. Quandt - simple thalloid I | Olczyski stream, Zakopane, Tatra Mts, Poland | OLS-L-ST2001 | OR220798/477x | OR220796/881x |
| Riccia fluitans L. - complex thalloid | Commercial | OLS-L-C-RF1 | OR220799/661x | OR220797/1579x |
| Scapania undulata (L.) Dumort. - leafy | Lomnica River, Karpacz, Sudety Mts, Poland | OLS-H-SC21091 | OR220800/165x | NC_061219.1/397x |


2.2. DNA extraction, library preparation and
sequencing


Total DNA was extracted using the modified CTAB procedure.
Fresh thallus or leaves (0.2 g) were ground in liquid
nitrogen, placed in a 12 mL tube, and flooded with 2.5 mL
of CTAB1 isolation buffer and 150 µL 2-mercaptoethanol.
Next, 5 µL of RNase (100 mg/mL) was added. The
samples were mixed by vortexing and incubated in a water 
bath at 70 °C with frequent stirring. After 2 h, the mix- 
ture was brought to room temperature. Then, 2.5 mL of 
dichloromethane was added and mixed by inverting the tube. 
Next, the tubes were centrifuged for 30 min at 8,000 rpm. 
The supernatant was then transferred to a clean 12 mL tube 
without disturbing the pellet. 5 mL of CTAB2 (precipitation 
buffer) was added and mixed gently by inverting the tube. 
After the precipitate had formed, the tubes were centrifuged 
for 30 min at 8,000 rpm. The supernatant was pipetted off, except
for about 1 mL of the mixture at the bottom of the tube,
which was transferred with any precipitate to a 1.5 mL tube.
The tubes were centrifuged for 15 min at 14,000 rpm. If 
precipitate on the tube wall was present, the supernatant was 
discarded, and the precipitate was then dissolved in 500 µL
of 1 M NaCl. Next, 500 µL of isopropanol was added and
mixed gently. After centrifugation for 30 min at 14,000 rpm,
the supernatant was pipetted off, and the pellet was rinsed
twice with 500 µL of 70% ethanol. Next, the tubes were
centrifuged for 10 min at 14,000 rpm, and the supernatant was
carefully removed. Open tubes were dried in a thermoblock
at 37 °C, and finally, the pellet was dissolved in 50 µL of
water by incubation at 37 °C for 1 h. The purity of DNA
samples was assessed spectrophotometrically using a Cary 60 
spectrophotometer (Agilent). DNA quantity was estimated 
using the Qubit fluorometer and Qubit™ dsDNA BR Assay Kit 
(Invitrogen, Carlsbad, CA, USA). DNA quality was checked
by capillary electrophoresis using Tapestation (Agilent) with a 
Genomic kit. If required, the extracted DNA was additionally 
cleaned and concentrated using Genomic Purification Kit
(New England Biolabs, hereafter NEB) according to the
manufacturer's protocol. Since total genomic DNA contains
only 1-5% of mitochondrial and plastid DNA, to enrich this 
fraction, a Microbial DNA enrichment kit (NEB) was used for 
Aneura pinguis, Apopellia endiviifolia and Scapania undulata 
samples. This method enables PCR-free, bead-based target 
enrichment, which is crucial for obtaining deep coverage and 
high-quality assemblies of organellar genomes. 


The long-read libraries were constructed using Ligation 
Sequencing Kit SQK-LSK114 (Oxford Nanopore Technolo- 
gies, hereafter ONT) and NEBNext® Companion Module for 
Oxford Nanopore Technologies® Ligation Sequencing (NEB).
The libraries of Aneura pinguis and Scapania undulata were
sequenced using a MinION Mk1C sequencer and a MinION
R10.4.1 flow cell, while libraries of Riccia fluitans and Apopellia
endiviifolia were sequenced using a PromethION P2 Solo device
and a dedicated R10.4.1 flow cell. To obtain the best quality
sequences, the generated raw reads were processed with 
duplex_tools 0.2.20 (ONT) to generate pairs of duplex reads, 
enabling over Q30 quality sequences. Reads were basecalled 
using Dorado 0.1.1 software (ONT) with a super accuracy 
model (SUP) and stereo-duplex mode. 
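
For orientation, the Q30 threshold mentioned above can be translated into a per-base error probability using the standard Phred relation Q = -10·log10(P). The following is a minimal illustrative sketch; the function names are hypothetical and are not part of duplex_tools or Dorado.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score into a per-base error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert a per-base error probability into a Phred quality score."""
    if p <= 0:
        raise ValueError("error probability must be positive")
    return -10 * math.log10(p)


# Q30 corresponds to roughly one expected error per 1,000 bases.
q30_error = phred_to_error_prob(30)
```

Under this relation, reads above Q30 carry fewer than one expected error per 1,000 bases, which is the accuracy range relevant for calling low-frequency variants.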


2.3. Organellar genome assembly and annotation 


Basecalled reads were assembled into contigs using Flye 2.9.1
(Kolmogorov et al., 2019) with the --meta flag and the minimum
overlap set to 2,000 bp, using 120 compute cores and up to
0.9 TB of RAM. Assembled contigs were mapped to existing
reference genomes using minimap2 (Li, 2018); the exception was
the Scapania undulata mitogenome, for which contigs were mapped
onto that of S. ampliata (NC_052751). The mapping
process revealed that for all the samples analyzed, organellar 
genomes were completely assembled during the de novo 
assembly stage. The draft genomes were circularized and 
remapped using minimap2 to correct possible errors and 
calculate mean coverage. Corrected genomes were anno- 
tated using data from previously sequenced plastomes and 
mitogenomes using Geneious Prime 2023. The newly assem- 
bled S. undulata mitogenome was drawn using OGDraw v3.1 
(Greiner et al., 2019). 
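
The mean coverage step can be illustrated with a short sketch. The text does not state which tool computed coverage, so this example assumes per-base depth is available as tab-separated rows of contig, position and depth (the layout written by `samtools depth`); the function name and the toy data are hypothetical.

```python
from collections import defaultdict


def mean_coverage(depth_lines, genome_lengths):
    """Mean depth per contig from tab-separated (contig, position, depth) rows.

    Positions absent from the input are treated as zero depth, so the sum of
    depths is divided by the full genome length rather than the row count.
    """
    totals = defaultdict(int)
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue
        contig, _position, depth = line.split("\t")
        totals[contig] += int(depth)
    return {name: totals[name] / length for name, length in genome_lengths.items()}


# Toy example: a 4 bp "genome" with position 3 missing from the depth table.
rows = ["mito\t1\t10", "mito\t2\t20", "mito\t4\t30"]
coverage = mean_coverage(rows, {"mito": 4})
```

Dividing by the genome length rather than by the number of reported rows avoids overestimating coverage when zero-depth positions are omitted from the input.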


2.4. Structural variants and SNP detection 


Analysis of the presence of structural variants of plastomes
within the analyzed species was carried out using the Cp-hap
pipeline (Wang & Lanfear, 2019). For each of the assembled
plastomes, a FASTA file was prepared, including the LSC,
SSC, and IR regions. The Cp-hap pipeline was run with
-t (threads) set to 120, -x map-ont, and -d (minimum
distance of exceeding the first and last conjunctions) set to
1,000. BAM mapping files from the assembly step were used
to call SNPs using Clair3 v1.0.0 (Zheng et al., 2022) and the
r1041_e82_400bps_sup_v400 model with the following parameters:
-t=120, -p, --min_coverage=50, --enable_long_indel.
Since long-read technologies are prone to deletion errors, 
especially within homopolymeric regions, the annotated 
variants of this kind were removed from further analyses. 
The distribution of heteroplasmic SNPs was visualized using 
Circos (Krzywinski et al., 2009). 
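
The homopolymer filtering step can be sketched as follows. The exact criteria used in the study are not given in the text, so the run-length threshold (`min_run=4`), the function names and the left-anchored VCF-style allele representation are all assumptions made for illustration.

```python
def homopolymer_run_length(reference: str, pos: int) -> int:
    """Length of the run of identical bases containing 0-based index pos."""
    base = reference[pos]
    start = pos
    while start > 0 and reference[start - 1] == base:
        start -= 1
    end = pos
    while end < len(reference) - 1 and reference[end + 1] == base:
        end += 1
    return end - start + 1


def is_homopolymer_deletion(reference, pos, ref_allele, alt_allele, min_run=4):
    """True if a left-anchored deletion removes bases from a homopolymer run.

    pos is the 0-based index of the first base of ref_allele; alt_allele is
    assumed to be a prefix of ref_allele, as in a normalized VCF record.
    min_run is an illustrative threshold, not a value taken from the study.
    """
    if len(ref_allele) <= len(alt_allele):
        return False  # substitution or insertion, not a deletion
    deleted = ref_allele[len(alt_allele):]
    if len(set(deleted)) != 1:
        return False  # deleted bases are not a single repeated nucleotide
    first_deleted = pos + len(alt_allele)
    return homopolymer_run_length(reference, first_deleted) >= min_run


reference = "GATTACAAAAAAGT"
# Deleting one A from the six-base A run (anchored at the C, index 5).
flagged = is_homopolymer_deletion(reference, 5, "CA", "C")
# Deleting one T from the two-base T run (anchored at the A, index 1).
kept = is_homopolymer_deletion(reference, 1, "AT", "A")
```

Variants for which the function returns True would be the ones discarded before downstream analysis under these assumptions.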


3. Results 


The preliminary sequencing runs of liverwort genomic libraries
using MinION R10.4.1 flow cells resulted in 2% and 5% of
total Gbp mapping to the mitogenome and plastome reference
genomes, respectively. The enrichment of total genomic
DNA towards organellar genomes increased these values
fivefold in the case of mtDNA and 10-fold for cpDNA. Total
coverage obtained for the studied species varied from 165x in
Scapania undulata (mitogenome) to 1,579x in Riccia fluitans
(plastome).


3.1. Characterization of the newly assembled 
organellar genomes 


The sequenced and assembled liverwort plastomes and mitoge- 
nomes share the same structure, gene, and intron content 
as previously published. The newly sequenced chloroplast 
genome of Riccia fluitans spans 121,989 base pairs with a GC
content of 28.9%. The plastome of Riccia fluitans is 10 bp shorter
than another sequenced specimen from Poland (MT023021)
and 307 bp longer than the Korean sample (NC_042887), 
with an identity of 99.957% and 99.632%, respectively. Riccia 
fluitans mitochondrial genome (OR220799) is 185,615 base 
pairs in length, which is six bp shorter than the second Polish 
sample (NC_043906) and 25 bp shorter than the Korean 
accession (MN927134). The genome comprises 74 genes, 
including 42 protein-coding genes, three rRNAs, 28 tRNAs, 
and one pseudogene (nad7). The overall GC content of the 
genome is 42.4%. The pairwise identity between the newly assembled
sample and those published previously was 99.961% and
99.972% for MN927134 and NC_043906, respectively.
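
GC content values such as those reported above can be computed directly from an assembled sequence. The sketch below is a minimal illustration, not the procedure used in the study (annotation was done in Geneious Prime); the function name is hypothetical, and ambiguous bases are excluded from the denominator by assumption.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


# Toy sequence: four G/C and four A/T bases.
example = gc_content("GGCCAATT")
```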


The nanopore-sequenced plastome of Scapania undulata was 
reported and described in the previous study (Ciborowski 
et al., 2022). The newly assembled mitochondrial genome is 
the first report of Scapania undulata mitogenome (GenBank 
accession number OR220800), which is 140,940 bp long and 
contains 72 genes (42 protein-coding genes, three rRNAs, and 
27 tRNAs) with overall GC content at 45.0% (Figure 1). The 
gene order of S. undulata mitogenome is the same as that of S. 
ornithopodioides (MK230950) and S. ampliata (NC_052751), 
but these mitogenomes are longer by 1,982 bp and 2,724 bp, 
respectively. The deletions, resulting in the smallest mitogenome
within the genus, were not evenly distributed but concentrated
in three regions: trnY(GUA)-trnR(ACG) (deletion
of 753 bp), the nad7 pseudogene (deletion of 1,152 bp) and
trnV(UAC)-trnD(GUC) (deletion of 417 bp).


The plastid genome of Apopellia endiviifolia sequenced in this 
study corresponds genetically to the typical form and has the 
same gene content and order as previously published, with a 
total length of 120,531 bp, which is the smallest among known 
plastomes of this cryptic species (Grosche et al., 2012; Paukszto
et al., 2023; Sawicki et al., 2021). One hundred twenty-two
unique genes (taking into account only one copy of the inverted
repeat regions) were identified in the plastome: 81 protein-coding
genes, four ribosomal RNAs, 31 transfer RNAs, and
six ycf genes of undetermined function. The sequenced
A. endiviifolia mitogenome was 109,260 bp long and had 
the same gene and intron content as previously published 
mitogenomes of this species. Apopellia mitogenomes are the 
smallest among the liverworts sequenced so far, but they 
contain all protein-coding genes and the full intron set known
from other Jungermanniopsida, including 41 protein-coding
genes, three rRNAs, and 26 tRNAs (loss of trnR-UCG).


The analyzed specimen of Aneura pinguis, according to 
molecular diagnostic characters (Baczkiewicz et al., 2017), 
belongs to species A and its organellar genomes have identical 
gene content and order as previously published (Myszczynski 
et al., 2017). Of the 122 unique genes (i.e., including one 
copy of the inverted repeats) identified in 120,802 bp long 
Aneura pinguis plastome, 81 are protein-coding genes, five 
genes of unknown function (ycf genes), four ribosomal RNAs 
and 32 tRNAs. The mitogenome of the analyzed A. pinguis
sample was 165,319 bp long and contained 71 genes, including
40 protein-coding genes, three rRNAs, and 28 tRNAs.


3.2. Intraindividual variation of liverwort plastomes 


Within the plastomes of the complex thalloid liverwort Riccia
fluitans and the simple thalloid I liverwort Apopellia endiviifolia,
no heteroplasmic SNPs were detected.


Mapping the reads onto the Scapania undulata plastome
revealed 73 SNPs (72 unique, one on the second copy of the IR).
Most of them were identified within the LSC (60), 12 within the SSC,
and one in the IR region. Among the identified variants, only four
indels were found, including two deletions (two and three
bp long) and two insertions (three and five bp long). Both the
deletions and the insertions were located next to each other,
the deletions in the psaI-pafII and the insertions in the
ndhC-trnV(UAC) intergenic spacer. The analysis revealed 69
substitutions, including 29 transversions and 40 transitions.
Among the latter, 28 were located in protein-coding sequences
(CDS), and half of them (14) were non-synonymous. In the case
of transversions, all 12 located in CDSs were non-synonymous
(Figure 2A). Analysis of intraindividual variation of the plastome
within Aneura pinguis revealed three transitions located in
protein-coding genes (Figure 2B). A non-synonymous
mutation was found in chlB (G to A) and synonymous ones in psaB
(G to A) and ycf1 (C to T). In both species, the detected
heteroplasmy was evenly distributed along the plastome,
with the exception of the rps11-trnA(UGC) region, where no
intraindividual variation was identified (Figure 3).
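
The split of substitutions into transitions and transversions follows from base chemistry: a transition exchanges a purine for a purine (A and G) or a pyrimidine for a pyrimidine (C and T), while any purine-pyrimidine exchange is a transversion. A minimal sketch of this classification, with hypothetical function and variable names, is given below.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref!r} -> {alt!r}")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# The G-to-A and C-to-T changes described above are both transitions;
# the A-to-C change is added here only as a contrasting toy example.
calls = [("G", "A"), ("G", "A"), ("C", "T"), ("A", "C")]
tally = Counter(classify_substitution(ref, alt) for ref, alt in calls)
```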


3.3. Intraindividual variation of liverwort mitogenomes 


Similar to the plastome, the analysis of Riccia fluitans mitoge- 
nome did not reveal any instances of heteroplasmy. 


Figure 1 Mitochondrial genome of Scapania undulata. Genes inside and outside the outer circle are transcribed in
counterclockwise and clockwise directions, respectively. The genes are color-coded based on their function. The inner circle
visualizes the G/C content.


Analysis of intraindividual variation of the mitogenome within
Scapania undulata revealed the presence of 91 polymorphic
sites, including nine indels and 82 substitutions. Among the
detected indels, two were located in the intronic regions of the
cox1 and atp9 genes, while the remaining ones were located in
intergenic regions. All indels can be qualified as short sequence
repeat (SSR) polymorphisms, including two mononucleotide
variants (but differing by 5 bp), six dinucleotide variants (CT
and AT motifs) and one trinucleotide variant with an AAT motif
(Figure 4A). Transitions (65) dominated over transversions (17),
but both types of substitutions were mostly found in non-coding
regions. In the protein-coding genes, only three transversions
and seven transitions were identified, within the sdh3, rpl2 (two
SNPs), cox2 (two SNPs), rtl, nad9, ccmFC (two SNPs) and
ccmFN genes. Four of these substitutions, found in the rpl2, nad9,
rtl and ccmFN genes, were non-synonymous.
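
Whether a substitution in a protein-coding gene is synonymous can be checked by translating the codon before and after the change. The sketch below assumes the standard genetic code and works on genomic codons only, so it does not account for RNA editing; the function name and example codons are hypothetical and are not taken from the analyzed genes.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def substitution_effect(codon: str, codon_pos: int, alt_base: str) -> str:
    """Return 'synonymous' or 'non-synonymous' for a single-base change.

    codon_pos is the 0-based position (0, 1 or 2) of the changed base.
    """
    codon = codon.upper()
    if len(codon) != 3 or codon_pos not in (0, 1, 2):
        raise ValueError("expected a 3-base codon and a position of 0, 1 or 2")
    mutated = codon[:codon_pos] + alt_base.upper() + codon[codon_pos + 1:]
    same = CODON_TABLE[codon] == CODON_TABLE[mutated]
    return "synonymous" if same else "non-synonymous"


# GGA -> GGG keeps glycine; GAT -> GAA changes aspartate to glutamate.
silent = substitution_effect("GGA", 2, "G")
changed = substitution_effect("GAT", 2, "A")
```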


Intraindividual variation of the Apopellia endiviifolia mitogenome
comprised mainly indels (475), but three substitution
events were also detected. All those substitutions were found
in non-coding regions and can be qualified as multiple
nucleotide variants (MNV), causing changes that concern
five to six adjacent nucleotides (Figure 4B). Among the indels,
42 were identified within CDS; 30 of them cause a potential
frameshift, while the remaining 12 expand or shorten the product
by up to three amino acids. The indels not causing a frameshift
were found across several genes, including rpl2, rpl5, rps3,
rps7, rps19, cob, ccmFC, ccmC and sdh4. Altogether, only
19 protein-coding genes (atp6-9; cox1-3; nad2-5; rps1, 8,
11-14; rpl10, 16) remained unaffected by intraindividual
indels. Besides typical SSR polymorphisms, including mono-,
di-, tri-, and tetranucleotide repeats, the character of the identified
indels is very specific: most of them are 7-14 nt long sequences that
are duplicates or triplicates of adjacent regions.
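
The distinction between frameshift and in-frame indels within CDS depends only on whether the length change is a multiple of three. A minimal sketch of that rule follows; the function names are hypothetical and the alleles are toy examples rather than variants from the dataset.

```python
def classify_cds_indel(ref_allele: str, alt_allele: str) -> str:
    """Classify an indel inside a coding sequence by its effect on the frame."""
    diff = len(alt_allele) - len(ref_allele)
    if diff == 0:
        raise ValueError("alleles have equal length; not an indel")
    if diff % 3 != 0:
        return "frameshift"
    return "in-frame insertion" if diff > 0 else "in-frame deletion"


def codons_gained_or_lost(ref_allele: str, alt_allele: str) -> int:
    """Number of whole codons added (positive) or removed (negative).

    Only meaningful for in-frame indels; frameshifts raise an error.
    """
    diff = len(alt_allele) - len(ref_allele)
    if diff == 0 or diff % 3 != 0:
        raise ValueError("not an in-frame indel")
    return diff // 3


# A 9 nt duplication adds three codons; a 7 nt insertion shifts the frame.
in_frame = classify_cds_indel("A", "A" + "GCTGCTGCT")
shifted = classify_cds_indel("A", "A" + "GCTGCTG")
```

Under this rule, the 7-14 nt duplicated segments described above cause a frameshift unless their length is 9 or 12 nt.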
[Figure residue: multiple sequence alignment panels. Panel B - heteroplasmy of Aneura pinguis plastome, chlB gene, positions 25,752-25,914.]


Figure 2 Examples of plastid heteroplasmy in Scapania undulata (A) and Aneura pinguis ( 


The detected heteroplasmy within the Aneura pinguis mitogenome comprised
both indels and substitutions. Most of the indels, with the exception of a
single "GA", were of the SSR type, including three loci of TA and two of CT
dinucleotide repeats (TA), and seven trinucleotide repeats of "GGC", "CGG",
and "CGC" motifs. Apart from the tetranucleotide SSR motif "CAGG" located in
the trnA(UGC)-rps10 intergenic spacer, which was repeated 7-10 times
(Figure 4C), the remaining loci differed by one repeat of each motif. The
analysis revealed 12 substitutions, including ten transitions (equal
proportions of C->T and G->A) and two transversions (both T->A). The latter
resulted in one non-synonymous change within the nadA gene, while the
remaining substitutions were located within intronic or IGS regions. The
distribution of heteroplasmic loci over the mitogenome differed between
substitutions and indels: the former were evenly distributed, whereas the
latter were present mostly in protein-coding regions of the mitogenome
(Figure 5).
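
The split of the 12 substitutions into transitions and transversions can be illustrated with a minimal sketch. The function name `classify_substitution` and the list `calls` are hypothetical; the tally only restates the counts given above (ten transitions in equal proportions of C->T and G->A, and two T->A transversions) and is not the variant-calling output of this study.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    # Reject identical bases and anything that is not a single A/C/G/T.
    if ref == alt or {ref, alt} - (PURINES | PYRIMIDINES):
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    # A transition stays within purines or within pyrimidines.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"

# Hypothetical tally reconstructed from the counts reported in the text.
calls = [("C", "T")] * 5 + [("G", "A")] * 5 + [("T", "A")] * 2
summary = Counter(classify_substitution(ref, alt) for ref, alt in calls)
print(summary)
```
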


Acta Societatis Botanicorum Poloniae / 2023 / Volume 92 / Article 172516 6 
Publisher: Polish Botanical Society 


Sawicki et al. / Heteroplasmy of liverwort organellar genomes 



Figure 3 The circular presentation of the heteroplasmic regions within the plastid genomes. The scatter dots indicate substitutions
within Scapania (red) and Aneura (blue) plastomes. The shaded larger dots show nonsynonymous substitutions within CDS regions.
The histograms depict the number of deletions (green), insertions (red), and substitutions (blue) identified as heteroplasmic
markers.


4. Discussion 


4.1. Application of nanopore sequencing in liverwort organellar genomics


The conserved structure and gene content of liverwort organellar genomes
are well known (Dong et al., 2021; Liu et al., 2014; Myszczynski et al.,
2017; Slipiko et al., 2017), and the results of our analysis support this
observation. The newly assembled organellar genomes have the same gene
order, structure, and content as previously published genomes assembled
using short-read technology. The consensus sequences of the organellar
genomes of Riccia fluitans, Aneura pinguis, and Apopellia endiviifolia, and
of the plastome of Scapania undulata, differ by only a few SNPs from those
previously published (Ciborowski et al., 2022; Min et al., 2020;
Myszczynski et al., 2017; Sawicki et al., 2020, 2021), and the newly
assembled mitogenome of Scapania undulata shares its genomic features with
those of S. ampliata (Choi et al., 2021) and S. ornithopodioides (Dong
et al., 2019). The mitogenomes of liverworts are characterized by low
variation at the generic level, at least with respect to their structure
and gene and intron content, but this observation is based on limited
resources, since only a few genera have been analyzed (Myszczynski et al.,
2017; Slipiko et al., 2022). A similar situation was also observed in
liverwort plastomes (Myszczynski et al., 2017; Slipiko et al., 2020, 2022),
with the exception of Conocephalum, where IR expansion was observed in
C. salebrosum (Sawicki et al., 2020).

Figure 4 Examples of mitochondrial heteroplasmy in Scapania undulata (A), Apopellia endiviifolia (B) and Aneura pinguis (C).

Figure 5 The circular presentation of the heteroplasmic regions within the mitochondrial genome. The scatter dots indicate
substitutions within Apopellia (green), Scapania (red), and Aneura (blue) mitogenomes. The shaded larger dots show
nonsynonymous substitutions within CDS regions. The histograms depict the number of deletions (green), insertions (red), and
substitutions (blue) identified as heteroplasmic markers.


In contrast to research on angiosperm chloroplast genomes, previous studies
on Scapania undulata (Ciborowski et al., 2022), Apopellia endiviifolia, and
Pellia epiphylla using long-read sequencing (Paukszto et al., 2023) did not
detect structural heteroplasmy associated with SSC subunit orientation.
Long-read sequencing revealed a single plastome haplotype in gymnosperms
and pteridophytes, suggesting that the alternative haplotype is specific to
angiosperms (Wang & Lanfear, 2019). However, the number of liverwort
organellar genomes assembled using long reads is still too low to exclude
the possibility of structural variation. Most of the available plastomes
were assembled using Illumina technology (Dong et al., 2021; Myszczynski
et al., 2017; Yu et al., 2019), which makes the detection of structural
variants difficult. Recent advances in nanopore sequencing, including
10.4.1 pores, duplex reads, and improved basecalling, have enabled fast and
high-quality assemblies of plastomes without the need for polishing with
short reads. Moreover, long-read technologies have overcome the
shortcomings of short-read technology and enabled the assembly of complete
organellar genomes in species rich in short tandem repeats, such as
Conocephalum (Sawicki et al., 2020), Apopellia, and Pellia (Paukszto et al.,
2023).


4.2. Heteroplasmy of plastid genomes 


Numerous studies have explored plastome heteroplasmy in vascular plants.
Earlier studies relied on fragment length variation of selected plastome
regions (Ellis et al., 2008; Iida et al., 2007). Heteroplasmy has also been
identified through the analysis of SNPs obtained by next-generation
sequencing. This approach demonstrated the presence of heteroplasmy in both
chloroplast and mitochondrial genomes of Phoenix dactylifera L., with 18 to
25 SNPs in mitogenomes and one to eight SNPs in plastomes across various
cultivars (Sabir et al., 2014). Structural heteroplasmy of the plastome has
been discovered in species within angiosperms, gymnosperms, and
pteridophytes using long-read sequencing (Wang & Lanfear, 2019).
Interestingly, most angiosperms showed heteroplasmy with almost equal
proportions of two different haplotypes (haplotypes A and B), and one
plant, Selaginella tamariscina, contained a third haplotype (C). The
detected intra-individual haplotypes differed in the orientation of the SSC
subunit (Wang & Lanfear, 2019). Robust structural heteroplasmy was also
detected in Eleocharis species, where it was not related to IR regions (Lee
et al., 2020). So far, there has been no information about plastome
structural heteroplasmy in liverworts and other bryophytes, and the
nanopore long-read sequencing used in this study likewise did not reveal
any structural plastome rearrangements within the analyzed species. It
seems that structural heteroplasmy caused by a difference in SSC
orientation is limited to seed plants (Ciborowski et al., 2022; Wang &
Lanfear, 2019).
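
The two structural haplotypes discussed above differ only in the orientation of the SSC region, so the alternative haplotype can be derived from a reference by reverse-complementing that region. The sketch below illustrates the idea on a toy sequence; the function names `reverse_complement` and `flip_ssc`, the 0-based half-open coordinates, and the toy regions are assumptions for illustration and are not part of the pipeline used in this study.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA string (A/C/G/T/N, either case)."""
    return seq.translate(_COMPLEMENT)[::-1]

def flip_ssc(plastome: str, ssc_start: int, ssc_end: int) -> str:
    """Return the plastome with the SSC region inverted.

    Coordinates are 0-based and half-open: [ssc_start, ssc_end).
    """
    if not 0 <= ssc_start < ssc_end <= len(plastome):
        raise ValueError("SSC coordinates out of range")
    return (
        plastome[:ssc_start]
        + reverse_complement(plastome[ssc_start:ssc_end])
        + plastome[ssc_end:]
    )

# Toy quadripartite plastome: LSC + IRa + SSC + IRb (IRb is the inverted IRa).
lsc, ira, ssc = "ATATAT", "GGGCA", "TTTAC"
hap_a = lsc + ira + ssc + reverse_complement(ira)
ssc_start = len(lsc) + len(ira)
hap_b = flip_ssc(hap_a, ssc_start, ssc_start + len(ssc))
print(hap_a)
print(hap_b)
```

Long reads that span the whole SSC and both IR junctions could then be assigned to whichever of the two references they match; reads covering only one junction are not informative about orientation.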


Although plastome heteroplasmy in vascular plants has been described by
many authors (Mandel et al., 2016; Sabir et al., 2014), there are no
studies of heteroplasmy in bryophytes. The analysis of plastid genomes of
four liverwort species representing all main evolutionary lineages of
liverworts indicates the presence of heteroplasmy in a simple thalloid
(Aneura pinguis) and a leafy liverwort (Scapania undulata). At this stage,
and with the limited number of available PCR-free libraries, it is hard to
conclude whether the molecular mechanisms behind heteroplasmy evolved after
the divergence of the Pelliidae and Metzgeriidae/Jungermanniidae lineages.


4.3. Mitochondrial heteroplasmy in liverworts 


Heteroplasmy also occurs in plant mitochondrial genomes, mainly due to
recombination among large repeats, which changes the structural
organization of the mitogenome (Gualberto et al., 2014). Arabidopsis
thaliana, Nicotiana tabacum, and Triticum aestivum carry two, three, and
ten recombinogenic repeats, respectively (Klein et al., 1994; Ogihara
et al., 2005; Sugiyama et al., 2005). Besides circular genomes, linear,
branched, and multichromosomal forms have also been found in many species
(Allen et al., 2007; Kozik et al., 2019; Szandar et al., 2022). In
liverwort mitogenomes, despite the presence of some repeats, recombination
events are very rare and are currently known from only two species, the
leafy Gymnomitrion concinnatum and the complex thalloid Dumortiera hirsuta
(Kwon et al., 2019; Myszczynski et al., 2018).


Some of the substitutions detected in the organellar genomes of Scapania
undulata and Aneura pinguis follow the pattern of RNA editing, which is
common and sometimes very intensive in liverworts (Dong et al., 2019;
Myszczynski et al., 2017). RNA editing is a process that modifies
transcripts from both organellar and nuclear genomes in various organisms
such as animals, plants, fungi, and protists (An et al., 2019; Knoop, 2011;
Rüdinger et al., 2011; Teichert, 2018). It is a crucial mechanism that
supplements the central dogma by producing RNA products that differ from
their DNA templates. The primary form of RNA editing in plants is the
conversion of C to U (canonical) or U to C (non-canonical) in mitochondria
and plastids. This modification generates alternative amino acid sequences
and is necessary for the proper functioning of some protein-coding genes
(Ichinose & Sugita, 2016). In plants, RNA editing is more common in
mitogenomes than in plastomes (Ruwe et al., 2013), but the frequency and
type of editing are species-specific (Rüdinger et al., 2012). Transcripts
containing edited nucleotides can be transferred to organellar genomes via
retroprocessing. According to multiple reports (Cohen et al., 2012; Derr &
Strathern, 1993; Fink, 1987), retroprocessing, also known as the reverse
transcriptase-mediated (RT-mediated) model, is the most commonly observed
mechanism of intron removal. This process involves the reverse
transcription of spliced and edited mRNA, followed by the integration of
the resulting intronless complementary DNA (cDNA) into the genome by
homologous recombination (Cohen et al., 2012; Derr & Strathern, 1993; Fink,
1987). As a result, the loss of introns is accompanied by the loss of
editing sites in the mitogenomes and plastomes, and this process is known
from liverworts (Dong et al., 2019; Slipiko et al., 2017).
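
Whether a heteroplasmic substitution resembles an RNA editing event can be checked with a simple rule: canonical C-to-U editing appears at the DNA level as C->T on the coding strand, and as G->A when the gene lies on the reverse strand. The sketch below assumes that the strand of the affected gene is known from the annotation; the function name `editing_like` and its labels are hypothetical. A match indicates resemblance to an editing pattern only and is not evidence of retroprocessing.

```python
_COMP = {"A": "T", "C": "G", "G": "C", "T": "A"}

def editing_like(ref: str, alt: str, strand: str = "+") -> str:
    """Label a DNA substitution by the RNA editing type it resembles.

    `ref` and `alt` are given in reference (forward) orientation; `strand`
    is the strand of the gene carrying the site ('+' or '-').
    """
    ref, alt = ref.upper(), alt.upper()
    if strand == "-":
        # Express the change on the coding strand of the gene.
        ref, alt = _COMP[ref], _COMP[alt]
    elif strand != "+":
        raise ValueError("strand must be '+' or '-'")
    if (ref, alt) == ("C", "T"):
        return "C-to-U-like"  # canonical editing direction
    if (ref, alt) == ("T", "C"):
        return "U-to-C-like"  # non-canonical (reverse) editing direction
    return "other"

print(editing_like("C", "T"))       # forward-strand gene
print(editing_like("G", "A", "-"))  # reverse-strand gene
print(editing_like("T", "A"))       # transversion
```
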


This mechanism can explain the lack of heteroplasmy in Riccia fluitans,
which, like other complex thalloid liverworts, is a non-editing species
(Dong et al., 2019; Myszczynski et al., 2019), as well as some transitions
in other species, especially in Scapania undulata (Figure 2). Moreover, the
evolutionary rate of complex thalloid organellar genomes is the slowest
among the main liverwort lineages (Sawicki et al., 2020; Villarreal et al.,
2016; Xiang et al., 2022), which could also affect intraindividual
variation.


Apart from the leafy S. undulata and the simple thalloid Aneura pinguis,
the heteroplasmy detected in the mitogenome of Apopellia endiviifolia is
based on copy number variation of short repeats present in protein-coding
and non-coding regions. Some studies suggest that this kind of copy number
variation can be an artifact of PCR enrichment and PCR-based sequencing
methods (Nakai et al., 2019); however, in our study we used native DNA
sequencing, which did not involve PCR steps. The generation of short
repeats is mainly explained by replication errors due to polymerase
slippage and inefficient DNA repair processes (Fan & Chu, 2007).
Mitogenome evolution in Pelliidae (the subclass of A. endiviifolia) is the
fastest among liverworts (Paukszto et al., 2023), which could also be
correlated with high mutation rates. In the case of mitogenomic
heteroplasmy within Apopellia, a mechanism related to biparental
inheritance can be excluded, since the plastome of this taxon lacks any
intraindividual variation.
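
Copy number variation of a short repeat can be summarised per read by counting the longest uninterrupted run of the motif. The sketch below is a minimal illustration on made-up reads; the function name `max_tandem_copies`, the toy reads, and the flanking sequences are hypothetical, and real nanopore reads would first need to be aligned and trimmed to the locus of interest.

```python
import re
from collections import Counter

def max_tandem_copies(seq: str, motif: str) -> int:
    """Return the longest uninterrupted run of `motif` in `seq`, in copies."""
    motif = motif.upper()
    # Each match is one maximal run of back-to-back motif copies.
    runs = re.findall(f"(?:{re.escape(motif)})+", seq.upper())
    return max((len(run) // len(motif) for run in runs), default=0)

# Made-up reads spanning a tetranucleotide repeat with differing copy numbers.
reads = [
    "TTT" + "CAGG" * 7 + "AAA",
    "TTT" + "CAGG" * 8 + "AAA",
    "TTT" + "CAGG" * 10 + "AAA",
    "TTT" + "CAGG" * 10 + "AAA",
]
copy_counts = Counter(max_tandem_copies(read, "CAGG") for read in reads)
print(sorted(copy_counts.items()))
```

More than one well-supported copy number among reads from a single individual is the kind of signal that the text above describes as repeat-based heteroplasmy; how much read support is required is a separate decision that this sketch does not make.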


5. Conclusions and future perspectives 


Nanopore sequencing can clearly reveal the phenomenon of heteroplasmy,
especially in plants, where both plastid and mitochondrial genomes show
large numbers of single nucleotide polymorphisms. This study verified that
heteroplasmy is present not only in vascular plants but also in liverworts.
It also confirmed the conservative character of the liverwort plastome,
because no structural changes and no structural heteroplasmy were found.
However, further studies are required to better understand the
heteroplasmy of liverwort organellar genomes. Besides wider taxon sampling,
including dioecious, monoecious, and vegetatively propagating species,
transcriptomics and in-depth analysis of replication surveillance genes
could help to explain this phenomenon.